24 - Artificial Intelligence II [ID:57526]

Welcome to the last AI2 lecture.

Of course, there's going to be another AI2 lecture next year, but this is the last one for this year.

We talked about natural language processing.

We've kind of worked our way up the ladder of complexity.

Yesterday I showed you quite a lot of little bits and pieces, essentially, feeding off the idea of language models, i.e. probability distributions that can be estimated from corpora.

In doing that, we're doing heavy-duty statistical computations; essentially, we need the corpora as the models of the language from which we derive the probability distributions.
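
To make that concrete, here is a minimal sketch in Python (my own toy example, not code from the lecture) of how a bigram distribution can be estimated from a tiny corpus just by counting:

```python
from collections import Counter

# Toy corpus; in practice this would be millions of sentences.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

unigram_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    unigram_counts.update(tokens[:-1])             # contexts
    bigram_counts.update(zip(tokens, tokens[1:]))  # adjacent word pairs

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev): count(prev word) / count(prev)."""
    if unigram_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(bigram_prob("the", "cat"))  # 1/4 = 0.25: "the" occurs 4 times, "the cat" once
```

Real systems smooth these counts and use longer contexts, but the principle is exactly this kind of counting over a corpus.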

And part of the work was seeing how, from those existing distributions, we can build various applications like part-of-speech tagging, machine translation, and those kinds of things.

But we also looked at how to condense the data.

Remember, that was the last bit we looked into: using grammars with a probability component, which allow us to represent these probability distributions more space-efficiently, at the cost of having to do parsing just to compute the probabilities.
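
As a reminder of how that works, here is a minimal sketch (a toy grammar with made-up probabilities, not from the lecture) of the idea that each rule carries a probability, and the probability of a parse tree is the product of the probabilities of the rules used in it:

```python
# Toy probabilistic grammar (made-up numbers): each rule has a probability,
# and the probabilities of all rules with the same left-hand side sum to 1.
rules = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("the", "cat")): 0.6,
    ("NP", ("the", "dog")): 0.4,
    ("VP", ("sleeps",)): 0.7,
    ("VP", ("barks",)): 0.3,
}

def tree_probability(tree):
    """Probability of a parse tree = product of the probabilities of its rules.
    A tree is (label, children), where children are subtrees or terminal strings."""
    label, children = tree
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rules[(label, child_labels)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_probability(c)
    return p

# Parse tree for "the cat sleeps": S -> NP VP, NP -> the cat, VP -> sleeps
tree = ("S", [("NP", ["the", "cat"]), ("VP", ["sleeps"])])
print(tree_probability(tree))  # 1.0 * 0.6 * 0.7 = 0.42
```

So the grammar stores a small number of rule probabilities instead of a huge table of sentence probabilities, but you have to parse a sentence before you can score it.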

By now, though, the dominant paradigm is really not to go via the statistics directly, but to learn these things using deep neural networks.

So that's what today's chapter is about.

There is a huge amount of research there, a good deal of it behind company firewalls.

So I'm going to do a whirlwind tour, giving you little bits and pieces and hints about these things.

There are full courses on this that give you the details, especially at Informatics 5, the Pattern Recognition chair; they do these things for real.

So we're going to look at things like word embeddings.

We're going to look at recurrent neural networks.

Remember, those were the networks with cycles; as little building blocks they have their place inside big neural networks that still have lots of feed-forward aspects, but contain small components that are recurrent.
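
Just to recall the mechanics, here is a minimal sketch of such a recurrent component (a vanilla RNN cell with made-up dimensions, using NumPy; not code from the lecture): the hidden state is fed back in at every step, and that feedback is the cycle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny recurrent cell: the hidden state is fed back in at every step,
# which is the "cycle" that distinguishes it from a feed-forward layer.
input_dim, hidden_dim = 3, 4
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1   # input -> hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden (the recurrence)
b_h  = np.zeros(hidden_dim)

def rnn_step(x, h):
    """One step of a vanilla RNN cell: new hidden state from input x and previous state h."""
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

# Run the cell over a sequence of 5 input vectors.
h = np.zeros(hidden_dim)
for x in rng.normal(size=(5, input_dim)):
    h = rnn_step(x, h)
print(h)  # final hidden state summarizes the whole sequence
```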

We're going to look at sequence-to-sequence models for machine translation and at transformer architectures.

And then, finally, we'll get a glimpse of how LLMs work internally.

As I said, I won't be able to go into any detail here.

The first thing, and one of the very influential ideas, is word embeddings.

Remember, a neural network is something that has N inputs, typically for a relatively large N, various hidden layers of neurons, and then some kind of output layer.

So if you want this architecture to take inputs that are not already vectors, essentially vectors of zeros and ones, then you have to do something to the object you want to use as the input part of an example: you have to turn it into a vector.

And for some things, that's easy.

If you have an image, that's what happens when learning handwritten letters and digits: a 20 by 20 image gives you 400 black-and-white dots.

You just concatenate the rows, and that gives you a 400-dimensional vector that you map to the inputs, forgetting that for us it's really a square.
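
In NumPy, that row-concatenation is a one-liner (a minimal sketch with random pixels standing in for a real handwritten digit):

```python
import numpy as np

# A 20x20 black-and-white image; random pixels stand in for a handwritten digit.
image = np.random.randint(0, 2, size=(20, 20))

# Concatenate the rows into one long vector: 20 * 20 = 400 network inputs.
input_vector = image.reshape(-1)   # row-major flattening, same as image.flatten()
print(input_vector.shape)          # (400,)
```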
